EXCAVATOR: a computer program for efficiently mining gene expression data
Abstract
Massive amounts of gene expression data are generated using microarrays for functional studies of genes, and gene expression data clustering is a useful tool for studying the functional relationships among genes in a biological process. We have developed a computer package, EXCAVATOR, for clustering gene expression profiles based on our new framework for representing gene expression data as a minimum spanning tree. EXCAVATOR uses a number of rigorous and efficient clustering algorithms. This program has a number of unique features, including capabilities for: (i) data-constrained clustering; (ii) identification of genes with similar expression profiles to pre-specified seed genes; (iii) cluster identification from a noisy background; (iv) computational comparison between different clustering results of the same data set. EXCAVATOR can be run from a Unix/Linux/DOS shell, from a Java interface or from a Web server. The clustering results can be visualized as colored figures and 2-dimensional plots. Moreover, EXCAVATOR provides a wide range of options for data formats, distance measures, objective functions, clustering algorithms, methods to choose the number of clusters, etc. The effectiveness of EXCAVATOR has been demonstrated on several experimental data sets. Its performance compares favorably against the popular K-means clustering method in terms of clustering quality and comput-
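To make the minimum-spanning-tree idea in the abstract concrete, here is a minimal sketch in Python. It is not EXCAVATOR's implementation: it assumes Euclidean distance between expression profiles and uses the simplest possible objective, cutting the K-1 longest tree edges. The abstract mentions several distance measures and objective functions without detailing them, so the choices here are assumptions, and the names `mst_edges` and `mst_clusters` are hypothetical.

```python
import math


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph.

    Returns the MST as a list of (weight, u, v) tuples.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known distance from each vertex to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the vertex outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        # Relax distances of the remaining vertices through u.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges


def mst_clusters(points, k):
    """Cut the k-1 longest MST edges and label the resulting subtrees.

    Assumption: this "longest edge" objective is only one illustrative choice.
    """
    edges = sorted(mst_edges(points))
    kept = edges[: max(len(edges) - (k - 1), 0)]

    # Union-find over the kept edges yields the connected components.
    root = list(range(len(points)))

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]
            x = root[x]
        return x

    for _, u, v in kept:
        root[find(u)] = find(v)

    # Number the components in order of first appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(points))]


# Toy expression profiles (made-up values): two genes near 0, two near 5.
profiles = [(0.0, 0.1, 0.0), (0.1, 0.0, 0.1), (5.0, 5.1, 5.0), (5.1, 5.0, 5.2)]
labels = mst_clusters(profiles, 2)
```

In this toy input the single long edge joining the two groups is the one that gets cut, so each group ends up as its own subtree. Once the tree is built, clustering reduces to choosing which edges to remove, which is why a tree representation can make the search cheaper than working on the full distance matrix.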
Similar resources
Gene mining: a novel and powerful ensemble decision approach to hunting for disease genes using microarray expression profiling
Current applications of microarrays focus on precise classification or discovery of biological types, for example tumor versus normal phenotypes in cancer research. Several challenging scientific tasks in the post-genomic epoch, like hunting for the genes underlying complex diseases from genome-wide gene expression profiles and thereby building the corresponding gene networks, are largely overlook...
Improving the Inference of Gene Expression Regulatory Networks with Data Aggregation Approach
Introduction: The major issue for the future of bioinformatics is the design of tools to determine the functions and all products of single-cell genes. This requires the integration of different biological disciplines as well as sophisticated mathematical and statistical tools. This study revealed that data mining techniques can be used to develop models for diagnosing high-risk or low-risk lif...
Evaluation of the Effect of Curcumin and Imatinib on BCR-ABL Expression Gene in Chronic Human K562 Cells
Background and Aims: Detection of overexpression in tumor-inhibiting genes provides valuable information for leukemia diagnosis and prognosis. Chronic myeloid leukemia (CML) is a stem cell disorder determined by a well-defined genetic anomaly involving BCR-ABL translocation in the Philadelphia chromosome. Curcumin is a chemo-preventive agent for the primary cancer targets, such as the breast, p...
Publication date: 2003